An Economic Argument for Open Data by Efraim Feinstein

Contributor(s):

Efraim Feinstein

Shared on:

9 February 2010 under the Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 International copyleft license

Categories:

Advocacy

Tags:

free culture, what is free, economics, semantic data, open-source

There are two principles on which the success of data on the contemporary web rests: the web makes content available, and it adds value to that content by linking it to other related information.

When bringing old content online, both of these aspects are important. A first level of digitization involves simply making data available. Google Books and Hebrewbooks.org work at this level, providing PDFs and/or OCR-ed transcriptions of the material. A second level of digitization involves semantic linkage of the data, both within the site and to other sites. The Open Siddur Project and Open Scriptures digitize at the semantic level. This second-level digitization is required to do all of the cool things we expect to be able to do with online texts: click on a word and find its definition or grammatical form, trace a passage in one text to its source in another, see how the text has evolved historically, etc. Even the simplest form of a link, a reference from another site, requires some kind of internal division of the text into parts that can be pointed to.

Digitization that takes advantage of the web therefore requires a number of steps: (1) getting the basic text online, (2) getting it in an addressable form (to make it more like typed text, instead of a picture of a page), (3) assuring the text’s accuracy, and (4) marking it up for semantic linkage. Some of these steps, or parts of them, can be done automatically, but overall they require some degree of intelligent input. Even step 1, which is primarily mechanical in nature, requires someone to design the procedures.
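To make steps 2 through 4 concrete, here is a minimal Python sketch of what "addressable" and "semantically linked" text could look like as data. It is an illustration only, not the data model of the Open Siddur Project or any other project named here. The class names, the identifiers (`example-bible`, `example-siddur`), the `quotes` relation, and the placeholder contents are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """One addressable unit of a text (step 2)."""
    text_id: str             # hypothetical identifier for the work
    ref: str                 # hypothetical internal address within the work
    content: str             # the transcribed text itself
    proofread: bool = False  # whether a person has checked it (step 3)

    @property
    def address(self) -> str:
        # The address is what another text, or another site, can point to.
        return f"{self.text_id}:{self.ref}"


@dataclass
class Corpus:
    """A collection of segments plus the semantic links between them (step 4)."""
    segments: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source, relation, target)

    def add(self, segment: Segment) -> None:
        self.segments[segment.address] = segment

    def link(self, source: str, relation: str, target: str) -> None:
        # A link is only possible if both ends have already been divided
        # into addressable segments.
        for address in (source, target):
            if address not in self.segments:
                raise KeyError(f"no addressable segment at {address!r}")
        self.links.append((source, relation, target))

    def sources_of(self, address: str) -> list:
        """Return the segments that the given segment is marked as quoting."""
        return [
            self.segments[target]
            for source, relation, target in self.links
            if source == address and relation == "quotes"
        ]


# Placeholder data: the identifiers and contents are invented for illustration.
corpus = Corpus()
verse = Segment("example-bible", "1.1", "placeholder verse text", proofread=True)
prayer = Segment("example-siddur", "p3", "placeholder prayer text")
corpus.add(verse)
corpus.add(prayer)
corpus.link(prayer.address, "quotes", verse.address)
```

In this sketch, every `Segment` has to be created, checked, and linked by someone, which is where the human labor goes. Once a base text exists as freely reusable segments, later projects can add links to it instead of re-creating it.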

I hope that this outline of the steps required to get a text online suggests that the most expensive part of making content available is human labor: it takes time to do it, and it takes even more time to do it right.

And now for the rhetorical questions:

How many times has the Tanach been digitized?
… the siddur?
… the Talmud?
… major commentaries on the siddur, Torah, Talmud (Rashi, Tosefot)?
… full codes of Jewish law (Mishneh Torah, Tur, Shulchan Aruch, Aruch Hashulchan)?
… uncommon piyyutim (liturgical poems)?

In some cases, the answer is: it’s been done many times. In other cases, the answer is: it’s never been done. Both answers lead to the all-important question: why? Why are there so many digitizations of the Tanach and no full digitizations of Shulchan Aruch online? Why isn’t the siddur already hyperlinked to its Talmudic sources?

I would propose that we have been wasteful with our resources. Earlier, I pointed out that the primary resources that go into these advanced digitizations are time and human labor. In some cases, these resources equate directly to money; in others, the linkage is more indirect.

The core material of all of the above-mentioned works comes from the public domain. It is ownerless, and free for anyone to copy for any purpose. Every time we encounter a basic text that we have to digitize again because of “new copyright” claims or EULA-style contractual constraints, that is an indication of a failure somewhere in the system. This is particularly true if the claims are being made by non-profits, “social” businesses, or academic institutions. In the Jewish world, even for-profit published books are sometimes donation-supported. Each common text that has to be digitized a second, third, or hundredth time equates to another less common text that is not being digitized. Redoing basic OCR work and transcription takes resources away from establishing semantic linkages.

Some people and organizations get it. As of now, we only need one digitization of the Leningrad Codex (Masoretic Bible). That’s because Christopher Kimball and the J. Alan Groves Center for Advanced Biblical Research digitized it, transcribed it, and released it as free data. The Westminster Leningrad Codex is now perhaps the most built-upon version of the Hebrew Bible online. The base texts (which may be used “without restriction”) are present in both commercial and non-commercial products. The Open Siddur Project is using it both for its technology demonstrations and as the basis of all biblical texts in the siddur.

There are precious few examples of free data in the Jewish community, even on the Internet. There are copious examples of donation-funded organizations presenting primarily public domain data with new copyright claims.

Free data removes the need to duplicate effort, which in turn keeps the community as a whole from spending wastefully. Particularly for organizations with a social mission, its use is a win for everyone.


“An Economic Argument for Open Data by Efraim Feinstein” is shared through the Open Siddur Project with a Creative Commons Attribution-ShareAlike 4.0 International copyleft license.

Efraim Feinstein

Efraim Feinstein is the lead developer of the Open Siddur web application.


Stable Link: https://opensiddur.org/?p=359


Terms of Use: Be a mentsch (a conscientious, considerate person) and adhere to the following guidelines:

Properly attribute the work to Efraim Feinstein.
Clearly indicate the date you accessed the work and in what ways, if any, you modified it. (If you have adapted the work, let us know so that the contributor might consider endorsing your revision.)
Provide the stable link to this resource: <https://opensiddur.org/?p=359>.
Indicate that the original work was shared under the Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 International copyleft license. (To redistribute or remix this work in any format, modified or unmodified, you must refer to the terms of the license under which the work is shared.)
